ABSTRACT 

This Graduate Record Examination (GRE) study 
assesses: (1) the relative contribution of a vocabulary score 
(consisting of GRE General Test antonyms and analogies) and a reading 
comprehension score (consisting of GRE sentence completion and 
reading comprehension sets) to the prediction of self-reported 
undergraduate grade point average (GPA); and (2) criterion-related 
validity patterns for item-type part scores on the GRE quantitative 
and analytical measures. Data from GRE files for 9,375 examinees in 
12 fields of study representing 437 undergraduate departments from 
149 colleges and universities were standardized within each 
undergraduate department, and then pooled for analysis by field. 
There were differences by major field in average performance on the 
various item-type part scores within each test. The reading 
comprehension subtest carried most of the predictive load in the GRE 
verbal measure. Item-type part scores on the other measures also 
exhibited differential patterns of relationships with the 
self-reported undergraduate grade point average. The findings suggest 
that the different item types within the respective broad ability 
measures may be tapping somewhat unique skills and abilities and that 
further exploration of their potential contribution is in order. 
(Author/BS) 
Abstract 



This study was undertaken (a) to assess the relative contribution of a 
vocabulary score made up of GRE General Test antonyms and analogies and a 
reading comprehension score made up of GRE sentence completions and reading 
comprehension sets to prediction of an academic criterion (self-reported 
undergraduate grade point average) and (b) to assess patterns of 
criterion-related validity for item-type part scores on the GRE quantitative 
and analytical measures as well. 

The study was based on data from GRE files for 9,375 examinees in 12 
fields of study representing 437 undergraduate departments from 149 colleges 
and universities. All data were standardized within each undergraduate 
department and then pooled for analysis by field. 

There were differences by major field in average performance on the 
various item-type part scores within each test. The reading comprehension 
subtest was found to carry most of the predictive load in the GRE verbal 
measure (consistent with findings for the reading comprehension subscore on 
the SAT verbal measure). Item-type part scores on the other measures also 
exhibited differential patterns of relationships with the self-reported 
undergraduate grade point average. 

The findings suggest that the different item types within the respective 
broad ability measures may be tapping somewhat unique skills and abilities 
and that further exploration of their potential contribution is in order. 
to Undergraduate Grades 



Kenneth Wilson 
Introduction 

The GRE General (Aptitude) Test provides measures of developed verbal, 
quantitative, and analytical abilities.* Only total verbal, quantitative, 
and analytical scores are reported. However, the three measures include 
different types of items that are thought of as being different methods of 
measuring their respective constructs (Rock, Werts, & Grandy, 1982). 

The verbal measure employs four types of questions or items: antonyms, 
analogies, sentence completions, and reading comprehension sets designed to 
test the ability to identify (a) words that are opposite in meaning, (b) 
words or phrases that are related to each other in the same way as other 
words or phrases, and (c) words that are logically and stylistically 
consistent with the sentence in which they appear; and (d) the ability to 
recognize in a reading passage the main ideas, information explicitly 
provided, implied ideas, the attitude of the author, and the like. 

Three item types are employed in the quantitative measure: 
quantitative comparisons (testing the ability to reason quickly and 
accurately regarding the relative sizes of two quantities or to perceive 
that not enough information is available to make such a decision); discrete 
quantitative items measuring basic mathematical skills or regular 
mathematics (balanced among questions requiring arithmetic, algebra, and 
geometry and designed to test basic mathematical skills and understandings 
of concepts, at levels applicable to individuals who have not specialized in 
mathematics); and data interpretation (testing the ability to synthesize 
information presented in tabular or graphic form, select data appropriate 
for answering a question, and so on). 

The 1981 revision of the analytical measure includes two item types: 
analytical reasoning items (testing the ability to understand a given 
structure of arbitrary relationships among fictitious entities, deduce new 
information from given relationships, and the like); and logical reasoning 
items (testing the ability to understand, analyze, and evaluate arguments, 
recognize the point of an argument or the assumptions on which it is based, 
analyze evidence, and the like). 

Although a continuing effort is made to obtain empirical evidence 
regarding the validity of the total verbal, quantitative, and analytical 



*For detailed descriptions of tests and item types, see, for example, ETS 
(1981). In October 1977, a restructured version of the GRE General Test 
including a newly developed analytical ability measure was introduced. 
Evidence of its predictive validity with respect to graduate grades was 
obtained in a cooperative study (Wilson, 1982). However, internal research 
indicated the need for some change in the item content of the 1977 
analytical measure and, in October 1981, a revised analytical measure was 
introduced. See Wild, Swinton, and Wallmark (1982) for a review of factors 
involved in the 1981 revision. 



scores for predicting performance in graduate study, little attention has 
been given to study of the predictive validity and diagnostic potential of 
part scores based on the various item types, in large part because of the 
lack of any compelling a priori evidential or theoretical basis for 
expecting differential predictive validity for part scores based on 
different item types measuring more general basic constructs such as verbal 
or quantitative ability. 

For example, items regardless of type are selected on the basis of 
internal consistency criteria designed among other things to assure the 
comparative homogeneity of the respective ability measures. This is 
conducive to relatively high intercorrelations among items and between 
individual items and the total scores on the respective tests. Such conditions 
theoretically militate against the likelihood, for example, that predictions 
based on regression-weighted composites of part scores would be consistently 
better than predictions based on the total score (in which the potential 
item-type part scores are weighted roughly according to their length). 
Although factor analytic studies (for example, Powers & Swinton, 1981; Rock, 
Werts, & Grandy, 1982) have suggested that word knowledge (vocabulary) and 
reading items (reading comprehension) are distinguishable factorially, this 
evidence alone has not been sufficiently persuasive to suggest that 
predictions based on the "vocabulary" items and predictions based on 
"reading comprehension" items would be very different. 

However, the need for an empirical evaluation of the predictive 
validity of item-type part scores on the GRE General Test was indicated by 
the results of undergraduate-level validity studies involving verbal 
item-type part scores on the College Board Scholastic Aptitude Test (SAT). 
For several years, vocabulary (VO) and reading comprehension (RC) scores 
have been reported in addition to the total SAT verbal score. The 
vocabulary score is based on antonyms and analogies and the reading 
comprehension score on sentence completions and reading comprehension sets. 
These items are completely parallel in type to those included in the GRE 
verbal measure. 

Based on internal analyses of the results of 110 studies conducted by 
the College Board Validity Study Service (VSS) at ETS (Ramist, 1981a; 1981b) 
in which colleges had specified vocabulary, reading comprehension, and total 
SAT verbal scores as predictors of freshman grades, the following findings 
emerged: 

o The average validity of the reading comprehension score alone (.373) was 
only .003 points lower than that for the entire verbal score (.376). 

o In almost one-half of the samples studied, the observed validity of the 
reading comprehension score was actually greater than that for the SAT 
verbal score, which includes the vocabulary score, the validity of which was 
consistently lower than that of the reading comprehension score. 

o When vocabulary and reading comprehension scores were combined in 
regression-weighted composites, the vocabulary score in a number of 
instances was negatively weighted, although its simple correlation with 
the GPA criterion was positive, indicating suppression of vocabulary 
variance in reading comprehension; that is, suggesting that the 
criterion-related variance in the vocabulary measure was being tapped 
sufficiently by the reading comprehension measure with which the 
vocabulary score is substantially correlated. 

o There was little improvement in predicting freshman grade point average 
when separate vocabulary and reading comprehension scores replaced the 
SAT total verbal score in regression equations including SAT 
mathematical scores and the high school record. 
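
The suppression pattern described in the findings above can be illustrated with the standard two-predictor formula for standardized regression weights. The following is a minimal sketch only: the three correlations are hypothetical values chosen to resemble the situation described (two highly intercorrelated predictors, one slightly more valid than the other) and are not results from the Validity Study Service analyses.

```python
# Illustrative sketch only: all correlations below are hypothetical values,
# not results from the studies cited in the text.

def standardized_weights(r_y1, r_y2, r_12):
    """Standardized regression weights for two correlated predictors.

    r_y1, r_y2: simple correlations of predictors 1 and 2 with the criterion.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_12 * r_y2) / denom
    beta_2 = (r_y2 - r_12 * r_y1) / denom
    return beta_1, beta_2

# Hypothetical case: vocabulary (predictor 1), reading comprehension (predictor 2).
r_vo, r_rc, r_vo_rc = 0.30, 0.37, 0.85
beta_vo, beta_rc = standardized_weights(r_vo, r_rc, r_vo_rc)

# Multiple correlation implied by the two weights.
multiple_r = (beta_vo * r_vo + beta_rc * r_rc) ** 0.5

print(f"beta(VO) = {beta_vo:.3f}, beta(RC) = {beta_rc:.3f}, R = {multiple_r:.3f}")
```

With these assumed inputs the vocabulary weight is negative even though its simple correlation with the criterion is positive, and the multiple correlation is only marginally higher than the validity of reading comprehension alone.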

These results were inconsistent with expectation and raised questions 
regarding the relative predictive role of the SAT vocabulary and reading 
comprehension items.* The present study was undertaken to assess the 
relationship to academic performance of similarly constructed GRE vocabulary 
and reading comprehension item-type part scores (and of item-type part 
scores based on items in the quantitative and analytical tests as well). 

Study Design, Sample, and Procedures 

The academic performance criterion selected for this exploratory study 
was self-reported undergraduate grade point average (SR-UGPA) routinely 
supplied by most GRE examinees during the process of test registration.** 
The SR-UGPA has been found to be a useful research surrogate for an 
officially computed UGPA as a predictor of graduate GPA (Wilson, 1982). 
Moreover, patterns of coefficients for GRE verbal, quantitative, and 
analytical scores vs SR-UGPA, computed for samples of undergraduate students 
majoring in selected fields (for example, Miller & Wild, 1979) appear to be 
similar to patterns of coefficients for these predictors vs graduate GPA 
(for example, Wilson, 1982). 

It was reasoned that results of an exploratory study involving SR-UGPA 
as the academic performance criterion would provide a useful empirical basis 
for initial assessment of the validity of item-type part scores. Such a 
study would also contribute to further understanding of the utility of the 
SR-UGPA in research concerned with test validation. 



*Several lines of inquiry have been initiated, including a study of the 
relationship of vocabulary and reading comprehension scores to self-reported 
high school rank, a study of the statistical properties of the four 
item types included in the SAT verbal measure, and a study of the 
criterion-related validity of specific verbal item types on one form of the 
SAT verbal test (Schrader, 1984). 

**Examinees are asked to report UGPA in the major field and UGPA over the 
last two college years. The criterion employed was the average of the two 
self-reported undergraduate grade point averages. 
The study was designed to simulate conditions characteristic of 
graduate-level validity studies in which comparable data sets for several 
small departmental samples are pooled for analysis by field or discipline 
(for example, Wilson, 1979; 1982). 

Study Sample and Data 

The study sample and basic study data were taken from GRE files on 
examinees tested between Oct. 1, 1981, and Sept. 30, 1982. The study sample 
included only examinees who reported better communication in English than in 
any other language, who were tested as enrolled undergraduates or 
nonenrolled college graduates no more than two years beyond the bachelor's 
degree, and who named both a field of study and an undergraduate school. 
Following procedures described below, data were obtained for examinees 
representing both (a) a relatively large number of undergraduate departments 
from each of 10 to 15 fields representing a wide range of verbal vs 
quantitative emphasis (for example, engineering to English), with some 
fields of relatively mixed emphasis such as education and biology. 

The records of examinees eligible for inclusion in the study (by 
enrollment, citizenship, language status, and data-availability criteria) 
were classified by reported undergraduate major field, and the fields were 
ordered in terms of the total number of designators. Within each field 
classification, examinees were distributed according to designated 
undergraduate school, and schools were ordered according to total number of 
designators without regard to field, that is, in terms of total volume of 
graduate-school bound, currently or recently enrolled students in the GRE 
pool. 

The 20 most frequently designated fields are listed below, and those 
selected for the study are identified by asterisks: 

psychology    political science*   economics*                computer science* 
biology*      chemistry*           sociology*                other biosciences 
English*      geology              mathematics*              other social sciences 
nursing       business             music                     physical education 
education*    history*             electrical engineering*   agriculture* 

English, history, sociology, and political science may be thought of 
as representing primarily verbal fields; chemistry, computer science, 
mathematics, electrical engineering, and economics were selected as 
representing primarily quantitative fields; and agriculture, biology, and 
education represent fields not clearly classifiable according to relative 
verbal and quantitative emphases. 

Schools and departments were selected, within each of the 12 field 
classifications, by specifying certain minimum Ns, set after inspection of 
the data, to lead to inclusion of 20 or more samples from undergraduate 
schools contributing varied numbers of students to the general GRE examinee 
pool. Results of the selection process are indicated in Table 1. Data on 
Table 1 

Distribution of Undergraduate Departmental Samples Included in the Study 

by Size and Field 
composition and minority representation in the sample, by field, are 

As may be determined from Table 1, the study sample included 9,375 
individuals from a total of 437 undergraduate departments in 149 different 
undergraduate institutions. In 8 of the 12 fields, the modal number of 
undergraduate majors per department was between 10 and 19, and distributions 
of Ns per department were positively skewed around these small modal values 
within each field. These conditions are quite similar to those encountered 
in graduate-level validity studies. 



GRE Item-Type Part Score Data 

For each member of the study sample, operational GRE scaled verbal, 
quantitative, and analytical scores and corresponding item response data 
were available, based on one of six different forms of the GRE General Test 
that were used during 1981-82. Each form included the same total number of 
items, and the same number of items by type, as indicated below: 



Variable                        No. of items 

Verbal Test                         (76) 
   Antonyms                          22 
   Analogies                         18 
   Sentence completions              14 
   Reading passages                  22 

Quantitative Test                   (60) 
   Quantitative comparison           30 
   Regular mathematics               20 
   Data interpretation               10 

Analytical Test                     (50) 
   Analytical reasoning              38 
   Logical reasoning                 12 



Raw total scores (based on the 76 verbal, 60 quantitative, and 50 
analytical test items) were computed for each member of the study sample 
taking each form of the test, and raw part scores were computed for each of 
the nine item types indicated above; in addition, a vocabulary score based 
on the 40 antonyms and analogies items and a reading comprehension score 
based on the 36 sentence completions and reading passage sets were computed 
for each individual. All raw scores were computed using the total number 
right scoring procedures introduced during 1981-82. 
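
The number-right scoring just described can be sketched as follows. This is an illustration under stated assumptions, not the scoring program used in the study: it assumes each examinee's scored responses are stored as 0/1 values grouped by item type in a fixed order (a hypothetical layout), and the names `ITEM_COUNTS` and `raw_part_scores` are introduced here for the example. The item counts are those given in the table above.

```python
# Sketch only: assumes scored responses are 0/1 values grouped by item type
# in the order listed below (a hypothetical storage layout).

ITEM_COUNTS = {
    "ANT": 22, "ANA": 18, "SC": 14, "RD": 22,   # verbal (76 items)
    "QC": 30, "RM": 20, "DI": 10,               # quantitative (60 items)
    "AR": 38, "LR": 12,                         # analytical (50 items)
}

def raw_part_scores(responses):
    """Return number-right scores for the nine item types plus composites."""
    if len(responses) != sum(ITEM_COUNTS.values()):
        raise ValueError("expected one scored response per item")
    scores, start = {}, 0
    for item_type, n_items in ITEM_COUNTS.items():
        scores[item_type] = sum(responses[start:start + n_items])
        start += n_items
    # Composite part scores and raw totals described in the text.
    scores["VO"] = scores["ANT"] + scores["ANA"]        # 40 items
    scores["RC"] = scores["SC"] + scores["RD"]          # 36 items
    scores["V_raw"] = scores["VO"] + scores["RC"]       # 76 items
    scores["Q_raw"] = scores["QC"] + scores["RM"] + scores["DI"]
    scores["A_raw"] = scores["AR"] + scores["LR"]
    return scores

# Hypothetical examinee who answers every item correctly.
perfect = raw_part_scores([1] * 186)
print(perfect["VO"], perfect["RC"], perfect["V_raw"])
```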

The part scores are of differing lengths, with corresponding differences 
in reliability. For example, based on internal analyses of two forms of the 
GRE General Test administered during 1981-82 (Wallmark, 1982a; 1982b), 
typical levels of reliability (estimated by Kuder-Richardson Formula 20) of 
the various GRE scores in general samples of GRE examinees are approximately 
as follows: 



Test                                   Typical form reliability 

Verbal Test (Total)                    .90+ (76 items) 

   Antonyms, analogies, and 
      sentence completions             .90  (54 items)* 
   Reading comprehension               .80+ (22 items) 

Quantitative Test (Total)              .90  (60 items) 

   Quantitative comparison             .80+ (30 items) 
   Regular mathematics                 .75+ (20 items) 
   Data interpretation                 .60+ (10 items) 

Analytical Test (Total)                .85+ (50 items) 

   Analytical reasoning                .80+ (38 items) 
   Logical reasoning                   .60+ (12 items) 



Based on these data, it is estimated that a 40-item vocabulary score 
and a 36-item reading comprehension score would each have reliabilities 
exceeding .80 in samples such as those employed in the internal studies 
cited. Since the validity of a test is partially a function of its 
reliability, the differences in reliability should be kept in mind in 
evaluating the validity of the various part scores; that is, a shorter test 
of a given ability may be expected to have somewhat lower validity than a 
longer test of that ability, given a common external criterion. For 
purposes of this study, reliabilities approximating those noted above are 
assumed to obtain for the various measures. 
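
The relation between test length and reliability discussed above can be made concrete with two standard formulas. The sketch below computes Kuder-Richardson Formula 20 for a small, entirely hypothetical matrix of scored responses, and then applies the Spearman-Brown formula to the approximate tabled values (.90 for the 54-item set, .80 for the 22 reading items). The use of Spearman-Brown is an assumption made for illustration; the text does not state how the estimates for the 40-item and 36-item scores were obtained, and the projection treats the added or removed items as comparable to those in the tabled sets.

```python
from statistics import pvariance

def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for rows of 0/1 item scores."""
    n_people = len(item_matrix)
    n_items = len(item_matrix[0])
    total_var = pvariance([sum(row) for row in item_matrix])
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_people  # proportion correct
        sum_pq += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / total_var)

def spearman_brown(reliability, old_length, new_length):
    """Projected reliability when test length changes by new/old."""
    k = new_length / old_length
    return k * reliability / (1.0 + (k - 1.0) * reliability)

# Hypothetical scored responses: 5 examinees by 4 items (not study data).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
kr20_demo = kr20(responses)

# Length adjustments from the approximate tabled values (assumed method).
vo_40 = spearman_brown(0.90, 54, 40)   # 40-item vocabulary score
rc_36 = spearman_brown(0.80, 22, 36)   # 36-item reading comprehension score

print(f"KR-20 (demo) = {kr20_demo:.2f}, VO = {vo_40:.3f}, RC = {rc_36:.3f}")
```

Under these assumptions both projected values fall above .80, which is in line with the estimate stated in the text.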

Preliminary operations on the raw GRE total and part scores. In 
evaluating the predictive validity of operational GRE verbal, quantitative, 
and analytical scores, the fact that the scores are based on different 
forms of the test does not pose problems of score comparability across 
forms. Through a process of test equating, raw total scores earned on each 
new form of the GRE General Test are placed on the GRE scale by means of 
formulas that calibrate the scores to make them comparable with those on 
earlier forms, regardless of differences in the level of difficulty of the 
respective forms (for example, ETS, 1981). 

However, equating procedures involve only the raw total scores on the 
respective forms of the test; different sets of item types within a test are 
not necessarily parallel in difficulty in a given form, and sets of items of 
a given type are not necessarily parallel in level of difficulty across 



*In internal analyses, sentence completions are combined with analogies and 
antonyms for statistical evaluation. 
forms. Thus, combining raw-score data across forms without formal equating 
introduces some elements of interpretive ambiguity into a validity study. 
The analysis could have been conducted using only data from a single test 
form (obviating interpretive complications) but this was not considered 
desirable, because use of single-form data would have substantially 
restricted sample size and because there might be differences across forms 
and administrations in examinee mix with respect to variables such as sex, 
educational status at time of testing, selectivity level of undergraduate 
school attended, and the like (variables that could have some bearing on 
study outcomes). 

Formal equating of the raw part scores was not feasible for this 
exploratory study. Without resolving questions regarding the relative 
difficulty of the respective item types within and/or across forms, it was 
decided to transform raw part and total scores to a common scale, by form, 
with full awareness of the attenuation in validity that might be associated 
with this procedure. In this regard, it was assumed that item types differ 
only randomly, within and across forms, with respect to parallelism. It was 
also assumed that attenuating effects due to lack of parallelism were not 
likely to affect systematically the relative validity of particular sets of 
items. (See Table 14 and Appendix B for evidence bearing on these 
assumptions.) 

Based on data for examinees taking each form of the GRE General Test 
without regard to their field of study, raw part and total score 
distributions were subjected to a z-scale transformation (mean = 0.0, 
standard deviation = 1.0); that is, raw part and total scores were expressed 
as deviations from the respective form grand means in standard deviation 
units, using the means and standard deviations shown in Appendix A.* 
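
The z-scale transformation by form can be sketched as below. This is an illustration only: the record layout, the form labels "A" and "B", and the scores are hypothetical, and the population standard deviation is used as an assumption, since the text does not say which divisor was applied.

```python
from collections import defaultdict
from statistics import mean, pstdev

def z_scale_by_form(records, score_key):
    """Express each raw score as a deviation from its own form's mean,
    in that form's standard deviation units."""
    by_form = defaultdict(list)
    for rec in records:
        by_form[rec["form"]].append(rec[score_key])
    # Per-form mean and (population) standard deviation.
    stats = {form: (mean(vals), pstdev(vals)) for form, vals in by_form.items()}
    return [
        (rec[score_key] - stats[rec["form"]][0]) / stats[rec["form"]][1]
        for rec in records
    ]

# Hypothetical raw vocabulary scores on two forms of differing difficulty.
records = [
    {"form": "A", "VO": 20}, {"form": "A", "VO": 30}, {"form": "A", "VO": 25},
    {"form": "B", "VO": 10}, {"form": "B", "VO": 30}, {"form": "B", "VO": 20},
]
z_vo = z_scale_by_form(records, "VO")
print([round(z, 3) for z in z_vo])
```

After the transformation every form has mean 0 and standard deviation 1, so an examinee's standing is expressed relative to others who took the same form; differences in difficulty between forms are removed, but so is any real difference in ability between the groups taking each form.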

It was reasoned that validity coefficients for the z-scaled part and 
total scores would be attenuated by any errors associated with lack of 
equating, while coefficients for the GRE scaled (converted, fully equated) 
total scores would not. It was assumed that comparison of validity 
coefficients for the total scaled (equated) scores with those for the 
z-scaled (unequated) total scores would indicate the overall effect on 
validity of combining unequated total (and part) scores across forms. It 
was assumed further that, for comparing the validity of total test scores 
with that of various part scores, the appropriate total scores would be the 
z-scaled transformations of the raw total scores (paralleling 
transformations of the respective part scores) rather than the converted GRE 
scaled total score. 



*Appendix A also provides data on the number of examinees taking each form, 
by sex. These data indicate pronounced differences in sex mix across forms 
and administrations; males constituted a majority of examinees taking forms 
administered in October, December, and February while females constituted a 
stronger majority of those taking forms administered in April and June. By 
inference, differences in major-field mix may also be present. 
In summary, the test variables available for study following the 
operations described above were as follows: 

V    GRE scaled verbal score (equated across forms) 

Q    GRE scaled quantitative score (as for V) 

A    GRE scaled analytical score (as for V) 

V*   Standardized raw total verbal score (not equated by form) 

Q*   Standardized raw total quantitative score (as for V*) 

A*   Standardized raw total analytical score (as for V*) 

Standardized raw item-type part scores: 

ANT (antonyms) 
ANA (analogies) 
SC (sentence completions) 
RD (reading passages) 
VO (vocabulary or ANT + ANA) 
RC (reading comprehension or SC + RD) 

QC (quantitative comparison) 
RM (regular mathematics) 
DI (data interpretation) 

AR (analytical reasoning) 
LR (logical reasoning) 

Finally, one additional set of GRE "total scores" (designated V#, Q#, 
and A#, respectively) was included, namely, one in which the various 
item-type part scores were given equal weight. Given the z-scaled part 
scores, total verbal, quantitative, and analytical scores defined by the sum 
of their respective parts were computed for each member of the study sample. 
In these total scores, item types are equally weighted since the standard 
deviations of the z-scaled scores are identical. If validity coefficients 
for V#, for example, should exceed those of, say, V or V* (in both of which 
the item-type subtests are weighted according to their length), then it may 
be concluded that the current relative representation of the respective 
parts in the total score is not consistent with their relative contribution 
to prediction. 
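
The contrast between equal weighting and length weighting can be shown with a small sketch. The z-scaled part scores below are hypothetical values for a single examinee, and the length-weighted composite is only a rough stand-in for the weighting implicit in V and V*, in which parts count roughly in proportion to their number of items.

```python
# Hypothetical z-scaled verbal part scores for one examinee (not study data).
z_parts = {"ANT": 0.50, "ANA": -0.25, "SC": 1.00, "RD": 0.25}

# Item counts by type, from the table of test composition.
ITEMS = {"ANT": 22, "ANA": 18, "SC": 14, "RD": 22}

# Equal-weight total: every item type contributes with the same unit weight.
v_equal = sum(z_parts.values())

# For contrast, weight each part by its share of the 76 verbal items
# (an approximation to length weighting, assumed here for illustration).
total_items = sum(ITEMS.values())
v_length = sum(z_parts[k] * ITEMS[k] / total_items for k in z_parts)

print(f"equal-weight sum = {v_equal:.2f}, length-weighted mean = {v_length:.3f}")
```

In the equal-weight total the 14 sentence completion items carry as much weight as the 22 antonyms or the 22 reading items, which is what makes it a useful check on whether the length-based representation of the parts matches their contribution to prediction.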



Study Procedures 



As indicated earlier, scores on the study variables were available for 
437 undergraduate departmental samples, distributed among 12 major fields. 
In order to assess similarities and differences among the major-field 
classifications, without regard to department of undergraduate enrollment, 
profiles of means on the z-scaled item-type part scores were developed for 
the 12 major-field groups. Questions regarding the relationship of the 
respective test measures to the SR-UGPA criterion were explored using scores 
that were first standardized by department and then pooled across all 
departments within the respective fields of study. 
Pooling rationale. Results of regression analyses in small samples are 
subject to substantial sampling fluctuation. By pooling data for several 
small samples from similar settings (for example, several undergraduate 
chemistry departments), it is possible to obtain more reliable estimates of 
relationships than would be possible in single small samples. In pooling 
data across departments one useful approach has been to standardize the 
predictor and the criterion variables within each department before 
pooling; that is, to express scores on all variables as deviations from 
department means in department standard deviation units (see, for example, 
Wilson, 1979; 1982). For each departmental sample, the mean on each 
variable is zero and the standard deviation is unity. 
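
The pooling approach described above can be sketched as follows. The department labels, scores, and function names are hypothetical and serve only to illustrate the procedure; population standard deviations are assumed.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardize_within(groups, values):
    """z-score each value using the mean and SD of its own department."""
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    stats = {g: (mean(vals), pstdev(vals)) for g, vals in by_group.items()}
    return [(v - stats[g][0]) / stats[g][1] for g, v in zip(groups, values)]

def pooled_correlation(groups, x, y):
    """Correlation of departmentally standardized x and y, pooled over
    departments."""
    zx = standardize_within(groups, x)
    zy = standardize_within(groups, y)
    # Pooled z-scores have mean 0 and SD 1, so the correlation equals the
    # mean cross-product.
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)

# Hypothetical data: two departments that differ in score level but show the
# same within-department relationship between test score and SR-UGPA.
dept = ["d1", "d1", "d1", "d2", "d2", "d2"]
rc_score = [10, 20, 30, 40, 50, 60]
sr_ugpa = [2.0, 3.0, 4.0, 2.0, 3.0, 4.0]
r_pooled = pooled_correlation(dept, rc_score, sr_ugpa)
print(round(r_pooled, 3))
```

Because each variable is centered and scaled within department, differences between departments in average score level or grading standards do not enter the pooled coefficient; only the within-department relationship does.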

Coefficients computed for pooled departmentally standardized variables, 
by field, may be thought of as approximating population values around which 
the coefficients for individual departments will vary due to selection- and 
sampling-related considerations (for example, restriction of range on 
predictors) as well as context-specific validity-related factors (for 
example, economics departments may differ in curricular emphasis on 
quantitative methods of analysis). 

Most of the variation in observed validity coefficients in samples 
from similar settings tends to be accounted for by statistical 
artifacts rather than by situation-specific validity-related factors. For 
example, it was found that about 70 percent of the variation across 726 
validity studies in the correlation between Law School Admission Test scores 
and first-year law school grades was attributable to differences in sample 
standard deviations, estimated criterion reliability, and sample size (Linn, 
Harnisch, & Dunbar, 1981). Similar findings have been reported for 
employment settings (for example, Pearlman, Schmidt, & Hunter, 1980). 

When analyses are based on pooled, departmentally standardized data 
within a given field of study, emphasis is on identifying the characteristic 
patterns of relationships between the respective GRE variables and the 
measure of academic performance under consideration. 



Major-Field Differences in Average Performance 
on Item-Type Subtests 

Table 2 shows means on the GRE verbal, quantitative, and analytical 
item-type part scores and the respective total scores for examinees in the 
12 major fields of study. For all except the converted (GRE scaled) verbal, 
quantitative, and analytical total score means, the means indicate the 
average deviation of the raw scores of examinees in a given field from the 
mean of all examinees in the study sample without regard to field, in 
all-examinee standard deviation units. 

Thus, for example, undergraduate English majors were .622 standard 
deviations above the all-examinee mean on the verbal test (mean STNRAW V* = 
.622), .576 standard deviations below the all-examinee mean on 
quantitative ability (mean STNRAW Q* = -.376), and so on. Similar 
interpretations may be made for other means in the table. 
Table 2 

Means for Major Field Groups on Test Variables 
Figure 1 highlights differences among and within fields in performance 
on the item-type part scores. Profiles for majors in the four humanities 
and social sciences fields and in education (thought of as verbal fields) 
are shown together in the left portion of the figure; those for majors in 
the four math and science fields and economics (thought of as quantitative 
fields), and in biology and agriculture (thought of as fields with mixed or 
balanced quantitative and verbal emphases) are shown in the right portion of 
the figure. 

Within-field differences in level of performance on the item-type part 
scores are of particular interest. For example, majors in the verbal fields 
typically performed better on the vocabulary items (ANT and ANA) than on the 
reading comprehension items (SC and RD); they performed better on data 
interpretation (DI) items than on quantitative comparisons (QC) and regular 
mathematics (RM) items; and, with the exception of majors in education, they 
performed at a sharply higher level on logical reasoning (LR) than on 
analytical reasoning (AR) items. 

Majors in chemistry, mathematics, engineering, and computer science 
tended to exhibit an opposite pattern, with higher performance on reading 
comprehension items than on vocabulary items, higher performance on 
quantitative comparisons and regular mathematics than on data interpretation 
items, and much better performance on analytical reasoning than on logical 
reasoning items. Mathematics majors differed from the others in this 
cluster primarily by performing considerably less well on reading passages 
(RD) items than on sentence completion (SC) items. 

Verbal part-score profiles for majors in economics, biology, and 
agriculture tended to parallel those for the math and science fields (better 
on reading comprehension than vocabulary); on the quantitative part scores, 
their profiles do not exhibit the extreme contrast between quantitative 
comparisons, regular mathematics, and data interpretation items 
characteristic of profiles for the math and science majors. With respect to 
items in the analytical test, agriculture and biology majors, like math and 
science majors, performed better on analytical than logical reasoning items, 
but economics majors, like the verbal majors, had a higher logical reasoning 
than analytical reasoning mean. 

Another way of assessing variability in major-field performance on 
item-type subtests within the respective ability measures is to examine (a) 
the relative standing of the several major field groups in terms of means on 
the subtests within a test and (b) the absolute differences in means for 
various pairs of subtests. For example, for two parallel tests a high 
degree of consistency in the ranking of field means and relatively small 
absolute differences in corresponding z-scaled means would be expected; a 
lower degree of consistency in field ranks combined with larger differences 
in z-scaled means, on the other hand, would be expected for tests measuring 
different abilities. 

Table 3 shows, for pairs of subtests within the respective tests, (a) 
whether the ranks of the 12 major fields were identical or different and the 
absolute difference in the ranks when differences were present and (b) the 
absolute difference in z-score means. The absence of an entry in the rank 



Figure 1« Prof lies ^of m^an scores on HRE ites-type subtests for undergraduate oajors in 
the fields selected for study 



Source: Table 2 
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difference colu«n iodicACes that the rcoks of the Mjor field gorup on the 
designeCed pair of tests were identical. For additional perspective, Ta^le 
3*1 8how8 rank correlations (rbo) of subtest aeans for the 12 aajor fields, 
actoss as well as vitbin tests. 

o Considering the two verbal subtests, VO and RC, for 10 of the 12 fields,
ranks of z-scaled means were identical and the median absolute
difference in z-scaled means was relatively small (.067). The rank
correlation (rho) for the 12 field means on these two variables was .888
(Table 3.1).


o For each of the three pairs of quantitative subtests, there were
relatively minor discrepancies in rank order, with no shift of more than
one rank. For QC and RM, the median absolute difference in means was
quite small (.034); however, median differences in field means were
greater for both QC and DI (.128) and RM and DI (.175). The average
rank correlation of field means (Table 3.1) for the three pairs of
quantitative subtests approached .99.

o For the two analytical ability subtests, AR and LR, some shifts in
ranking were found for every field, the median absolute difference in
z-scaled means (.309) was higher than that for subtests within the
verbal and quantitative ability measures, and the rank correlation of
field means on AR and LR (rho = .4[?]1) was considerably lower than that
for the other pairs of subtests.

The findings regarding field means indicate the differential
development within individuals, associated with field of concentration, of
the skills and abilities being measured by different item types within the
respective tests. On balance, the evidence reviewed in this section
suggests that the item-type part scores are not simply different methods of
measuring their respective constructs but that they may represent
distinguishable components of underlying general abilities with the
potential for independent measurement utility.

In this connection it is important to note (a) that the degree of
consistency in major field performance differentials is greater for subtests
within the verbal and quantitative ability measures than for the two
analytical ability subtests, (b) that the field ranks on analytical
reasoning items correspond closely with ranks on all three quantitative item
types (average rho of approximately .960), and (c) that field ranks on
logical reasoning items correspond closely with ranks on the two verbal
subtests (rho = .8967 for both LR-VO and LR-RC). Generally speaking, the
rank correlations in Table 3.1 indicate that, insofar as major field
performance differences are concerned, the information conveyed by the
analytical reasoning and the quantitative subtests is similar and that
conveyed by the logical reasoning subtest and verbal subtests is also
similar.



Exploratory Evaluation of Part-Score Validity 

The analyses involving part scores on the verbal measure were guided by
several a priori working hypotheses, based on the College Board findings
cited at the outset:

1. The GRE reading comprehension (RC) subtest (based on sentence
completions and reading comprehension sets) should be more closely related
to SR-UGPA than the GRE vocabulary (VO) subtest (based on antonyms and
analogies).

2. The RC subtest should be comparable in validity to the total
GRE verbal test, including the 40 VO items.

3. The multiple correlation of the RC,Q*,A* battery with SR-UGPA should be
comparable to that of the V*,Q*,A* battery.

4. Occasional suppression of VO, but not RC, variance may be expected in
composites including RC, VO, and other GRE variables.

In the absence of comparable working hypotheses regarding the
quantitative and analytical part scores, evaluation of observed
relationships for these item types was guided by interest in (a) the
relative contribution of the respective item-type part scores within each
test to prediction of SR-UGPA, (b) the comparative validity of total test
scores and the component part scores, and (c) evidence suggesting the
possibility that separately scored item-type subtests might provide a basis
for improved assessment.



The Verbal Test Part-Score Analysis

Table 4 shows pooled within-department correlations between SR-UGPA and
(a) VO and RC scores, (b) various verbal total scores, and (c) a
best-weighted combination of VO and RC scores, by field, and for all fields
combined. Validity coefficients for V* (the raw unequated total verbal
score, z-scaled by test form) were slightly lower than those for V (the
converted, GRE-scaled operational verbal score). This outcome is expected
because V* total scores, like the respective part scores, were not equated
across forms. In comparing part- and total-score validity, V* is judged to
be the more appropriate total, under the assumption that attenuating effects
associated with lack of equating across test forms are comparable for V*
and the respective part scores. Coefficients for V# (a total defined as the
sum of equally weighted scores on analogies, antonyms, sentence completions,
and reading sets) and V* are assumed to be comparably attenuated due to
errors associated with lack of equating across forms. This same line of
reasoning is applicable, of course, to later consideration of data on the
quantitative and analytical measures.
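
As a rough illustration of what a pooled within-department correlation
involves, the sketch below standardizes each variable within department and
then averages the cross-products over all examinees, which amounts to a
sample-size-weighted average of the departmental correlations. The function
names and the tiny data set are hypothetical, and the use of the population
standard deviation is an assumption; the report does not describe its
computations at this level of detail.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within(groups, values):
    """z-scale `values` separately within each group (e.g., department)."""
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    # Each group needs some spread; a constant group would divide by zero.
    stats = {g: (mean(vs), pstdev(vs)) for g, vs in by_group.items()}
    return [(v - stats[g][0]) / stats[g][1] for g, v in zip(groups, values)]


def pooled_within_r(groups, x, y):
    """Correlation of x and y after removing between-group differences."""
    zx = standardize_within(groups, x)
    zy = standardize_within(groups, y)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)


# Hypothetical example: two departments with very different score levels.
dept = ["a", "a", "a", "b", "b", "b"]
score = [1, 2, 3, 101, 102, 103]
gpa = [3.0, 3.2, 3.4, 2.0, 2.2, 2.4]
print(pooled_within_r(dept, score, gpa))
```

In the example the raw correlation across all six cases is negative because
the higher-scoring department has lower grades, while the pooled
within-department coefficient reflects only the relationship inside each
department.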

The validity coefficients for the verbal measure varied by field
generally in accordance with the expectation of higher validity in the more
verbal fields than in the more quantitative fields. This was true without
regard to the particular verbal measure under consideration. However, for

Table 4

Pooled Within-Department Correlations of Selected Verbal
Part and Total Scores with SR-UGPA, by Field

Field    (N)    Verbal part    Verbal total    Difference in validity

Note. V* is the raw total verbal score, z-scaled by form.
V# is an equally weighted sum of four verbal part scores.
V is the converted, GRE-scaled verbal score, equated across forms.
VO,RC is a best-weighted composite of the designated part scores.
Entries are correlation coefficients without decimals.

economics, among the more quantitative fields, the verbal test had validity
coefficients comparable to the coefficients for the English, history,
sociology, and political science samples.

With respect to patterns of verbal part-score validity, the findings in
Table 4 are generally consistent with the basic working hypotheses outlined
above.

o Considering first the all-fields coefficients (equivalent to weighted
averages of coefficients for 437 departments without regard to field),
the validity for RC is greater than that for VO (coefficients differ by
.038, as indicated in the RC vs VO difference column), and this was true
for 10 of the 12 fields. For chemistry and computer science
departments, the mean VO coefficient was higher than the mean RC
coefficient, but the mean difference was less than the average for all
departments in absolute magnitude.

o RC alone was about as valid as V* including the VO items. Considering
data for all departments, without regard to field, coefficients were
.301 and .309, respectively. RC was actually slightly more valid than
V* in several fields.

o However, a best-weighted composite of VO and RC did not yield much
better prediction than the V* total score, similar to the results
observed with SAT vocabulary and reading comprehension when they were
similarly treated. Largest differences in validity between the VO,RC
composite and V* occurred in three of the four fields in which RC was
more valid than V*, and in which differences in validity between RC and
VO were greatest, namely, electrical engineering, mathematics, and
political science.

The data in Table 5 lend support to the working hypothesis that the
multiple correlation of an RC,Q*,A* composite with SR-UGPA should be
comparable to that of a V*,Q*,A* composite.

o For all departments, without regard to field, the coefficient for
RC,Q*,A* was only .002 points less than that for V*,Q*,A*, and .005
points less than that for VO,RC,Q*,A*; that is, when VO was added to the
RC,Q*,A* battery there was little increase in the multiple correlation.
There were no notable exceptions to this general finding by field.

Table 6 provides evidence regarding the relative weighting of two sets
of verbal part scores, namely, VO and RC (Set 1), and the four basic verbal
item types, namely, analogies, antonyms, sentence completions, and reading
comprehension sets (Set 2), when included in a battery with Q* and A*.

o The data in Set 2 indicate, among other things, (a) that, over all
departments, the relative weighting of sentence completions and reading
items (components of the RC score) was approximately equal, (b) that one
of these two RC item types was the highest of the four verbal item types
in all fields but one (agriculture), but (c) that the relative weighting
of the SC and RD items, when they were allowed to compete independently,
varied across fields without regard to their verbal or quantitative
emphasis.




Table 5

Multiple Correlation with UGPA of Quantitative,
Analytical, and Selected Verbal Scores, by Field

                                      350   -002   -004    002
            ( 976)      299    306    306*   007    000    007
            (1649)      356    366    366    010   -001    001
All Fields  (9375)      361    366    363    002   -003    005

Note: Entries are multiple correlation coefficients or differences
between designated coefficients without decimals.

VO = ANT + ANA = Vocabulary
RC = SC + RD = Reading Comprehension

V*, Q* and A* are raw total scores on the respective tests,
z-scaled by form.

**A* variance is suppressed in this composite.

VO variance is suppressed in this composite.

Table 6

Relative Weighting of Two Sets of Verbal Part Scores, Quantitative
and Analytical Scores in Composites for Predicting SR-UGPA, by Field

                          Set 1 beta weights                   Set 2 beta weights
Field        (N)      VO    RC    Q*    A*    (R)     ANA    ANT    SC     RD     Q*     A*    (R)

English     ( 884)   166   [?]   [?]   017   (403)    104    077    168    168   0[?]    010   (411)
History     ( 584)   159   [?]   [?]  -045   (382)    037    101    241    072    078   -039   (398)
Sociol      ( 364)   126   215   035   171   (447)    [?]    031    056    168    034    169   (451)
Pol Sci     ( 545)   038   256   232  -054   (420)    093   -045    108    180    230   -060   (425)
Chem        ( 644)   064   031   279   072   (375)    020    054    007    030    277    072   (376)
CS          ( 647)   106   011   274   064   (374)    059    063  -06[?]   065    274    059   (376)
Math        ( 251)  -012   237   300  -034   (412)   -128    077    209    098    296   -016   (431)
Elec Engin  ( 850)   [?]   179   320   065   (406)   -089   -061    068    123    320    066   (406)
Econ        ( 663)   099   196   144   143   (458)    040    066    102    129    143    145    [?]
Biol        (1318)   050   111   187   054   (334)    028    036    033    142    189    052   (357)
Agric       ( 976)   085   038   [?]   076   (306)    026    065    050    009    179    073   (309)
Educ        (1649)   122   131   148   048   (366)    099    040    103    053    142    048   (371)
All Fields  (9375)   079   145   185   047   (366)    044    044    079    090    185    047   (366)

Note. Entries are standard partial regression (beta) weights and
multiple correlation coefficients without decimals.
VO = ANT + ANA = Vocabulary; RC = SC + RD = Reading Comprehension.
Q* and A* are raw total test scores, z-scaled by form.
Negative weights reflect suppression of variance; zero-order coefficients are positive.
Underscored weights are estimated to be significant at the .05 level.

o Suppressor effects, indicated by negative regression weights for
predictors that were positively correlated with a criterion, were present
for VO and/or VO component item types in analyses for mathematics,
engineering, and political science departments, consistent with the
hypothesis of occasional suppressor effects for vocabulary items; in one
analysis (computer science departments), the sentence completion subtest
was negatively weighted, contrary to hypothesis.

o The Set 1 and Set 2 multiple correlations are identical in the analysis
over all departments and are almost so in the respective field analyses.
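
The suppressor pattern noted above (a negative regression weight for a
predictor whose zero-order correlation with the criterion is positive) can be
reproduced with the standard two-predictor formulas for standardized
regression weights. The sketch below is illustrative only: the three
correlations are hypothetical values chosen to produce the pattern, not
coefficients from Table 4 or Table 6, and the function name is made up.

```python
from math import sqrt


def two_predictor_betas(r1y, r2y, r12):
    """Standardized (beta) weights and multiple R for two predictors.

    r1y, r2y: zero-order correlations of predictors 1 and 2 with the criterion.
    r12: correlation between the two predictors.
    """
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    multiple_r = sqrt(beta1 * r1y + beta2 * r2y)
    return beta1, beta2, multiple_r


# Hypothetical values: a "VO-like" predictor correlating .20 with the
# criterion, an "RC-like" predictor correlating .30, and the two predictors
# correlating .75 with each other.
beta_vo, beta_rc, multiple_r = two_predictor_betas(0.20, 0.30, 0.75)
print(round(beta_vo, 3), round(beta_rc, 3), round(multiple_r, 3))
```

With these inputs the first predictor receives a small negative weight even
though it correlates positively with the criterion, and the multiple R is
only marginally above the larger zero-order correlation.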

The analyses reviewed above indicate differences in the criterion-
related validity of the VO and RC subtests favoring the RC subtest, which
appears to be carrying most of the predictive validity load in the total
verbal score when the criterion is SR-UGPA.



The Quantitative Test Part-Score Analysis

Table 7 provides data on the relationship of the three quantitative
item-type part scores to SR-UGPA. The correlations of three quantitative
total scores, namely, Q*, Q#, and Q, with the same criterion are also
shown. As expected, the validity coefficients for the various quantitative
total scores are higher for the math and science and economics departments
than for the other, less quantitative fields; however, the higher validity
of quantitative scores for political science departments than for other
verbal departments was not expected.

In the absence of an a priori basis for expecting particular patterns
of differential validity for the respective item types, perhaps the most
relevant general consideration to be kept in mind is that the three
quantitative subtests differ in length. QC includes 30 quantitative
comparison items, RM includes 20 regular mathematics items, and DI includes
10 data interpretation items. Thus, we would expect validity
coefficients to vary with test length if the three item types are actually
homogeneous with respect to the abilities they tap.
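
One way to make the length argument concrete is the classical Spearman-Brown
relation. The report does not state which formula, if any, it had in mind, so
its use here is an assumption. For a subtest with reliability r_xx and
validity r_xy, lengthening by a factor k with homogeneous items projects a
validity of r_xy * sqrt(k / (1 + (k - 1) * r_xx)). In the sketch below the
reliability of .60 is a hypothetical value; only the all-fields DI validity
(.201, Table 7) and the item counts come from the report.

```python
from math import sqrt


def spearman_brown(reliability, k):
    """Projected reliability when a test is lengthened by a factor k."""
    return k * reliability / (1 + (k - 1) * reliability)


def projected_validity(validity, reliability, k):
    """Projected validity of a test lengthened by a factor k.

    Assumes the added items are homogeneous with the original ones.
    """
    return validity * sqrt(k / (1 + (k - 1) * reliability))


di_validity = 0.201        # all-fields DI coefficient reported in Table 7
assumed_reliability = 0.60  # hypothetical; not reported in the source
k = 30 / 10                 # lengthening the 10-item DI subtest to QC's 30 items

print(round(projected_validity(di_validity, assumed_reliability, k), 3))
```

Under these assumptions, a projected value well below the observed QC
coefficient would suggest that length alone does not account for the
difference between the two item types; the conclusion depends heavily on the
assumed reliability.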

o For all departments, the validity patterns for QC, RM, and DI followed
the variation-according-to-length hypothesis, and this was true for
several of the field analyses as well. However, there were exceptions.
For example, RM validities were somewhat higher than those for QC in
several fields, most notably so in mathematics; DI validities comparable
to those for QC were obtained in analyses for agriculture, English, and
sociology (which are among the fields in which students performed better
on the DI subtest than on the QC and RM subtests; see Figure 1).

o A composite of the separately weighted part scores did not result in
better prediction than that provided by Q*, based on the analysis over
all departments. Only in the analysis for mathematics departments, in
which the regular mathematics item type had uniquely high validity, was
there a notable exception to the foregoing generalization. The content

Table 7

Pooled Within-Department Correlations of Quantitative Part
and Various Total Scores with UGPA, by Field

                          Quantitative part       Quantitative total
                               scores                   scores
Field          (N)        QC     RM     DI      Q*     Q#     Q     QC,RM,DI
                           r      r      r       r      r     r        R

English       ( 884)     [?]    209    192     238    241    246      245
History       ( 584)   1[?]6    203    126     212    205    225      224
Sociology     ( 364)     226    259    216     285    293    310      286
Polit Sci     ( 545)     353    269    216     353    335    362      362
Chemistry     ( 644)     330    305    203   33[?]    340    371      366
Computer Sci  ( 647)     285    293    258     350    349    356      350
Mathematics   ( 251)     294    366    210     356    340    378      382
Elec Engin    ( 850)     346    290    212     378    356    397      380
Economics     ( 663)     307    283    212     348    339    358      348
Biology       (1318)     268    246    182     296    287    310      298
Agriculture   ( 976)     216    234    217     276    280    306      278
Educ          (1649)     285    243  19[?]     302    292    302      304
All Fields    (9375)     274    257    201     308    300    320      308

Note. Q* is the raw total quantitative score, z-scaled by form.
Q# is an equally weighted sum of quantitative part scores.
Q is the converted quantitative score, equated across forms.
Entries are correlation coefficients without decimals.

of the regular mathematics items may overlap more with the content of
the major field for mathematics majors than for majors in the other
fields. If so, this would help to explain the strong predictive
validity of these items and would be consistent with findings of
previous research indicating characteristically higher validity for the
GRE Subject (Advanced) Tests than for the General (Aptitude) Test (see,
for example, Willingham, 1974; Wilson, 1979; 1982).

Table 8 provides insight into the relative weighting of QC, RM, and DI
when the three part scores were included in a battery with A* and V*. The
difference in multiple correlation between the QC,RM,DI,A*,V* composite and
the Q*,A*,V* composite is also shown. The predictive load, relative to the
SR-UGPA criterion, in the quantitative test is being borne primarily by the
QC and RM items, judging from the findings in Table 8.

o DI contributed only slightly to prediction, generally, and attained
statistical significance only in the analyses for computer science and
agriculture departments; suppression effects were found for DI in
analyses for two verbal fields (history and political science) and
mathematics. In a stepwise regression program, QC, RM, and DI were
entered as a set, followed sequentially by the introduction of A*, then
V*. In the three analyses showing DI suppression (and in all other
analyses), the weight for DI was positive in the initial quantitative
set. The DI weight became negative only after the introduction of the
final variable (V*) in analyses for history and political science, but
after the introduction of A* in the mathematics analysis.

o Separate treatment of QC, RM, and DI part scores in a battery with A*
and V* did not lead to better prediction than that provided by Q*,A*,V*
(see difference column in Table 8).

The Analytical Test Part-Score Analysis

The analytical ability measure introduced in October 1981 is a revised
version of the analytical measure introduced when the GRE General Test was
restructured in 1977. There is empirical evidence regarding the validity of
the October 1977 analytical measure for predicting graduate school
performance (for example, Wilson, 1982), but evidence regarding the October
1981 version is more limited. Evidence of positive relationships between
SR-UGPA and analytical reasoning and logical reasoning items, respectively,
was reported by Wild, Swinton, and Wallmark (1982) in studies leading to
the revision of the 1977 measure. In those studies, logical reasoning items
were found to be more closely related to SR-UGPA than analytical reasoning
items in samples that were not differentiated with respect to field.

The analyses reported in this section provide evidence regarding the
relationship of the various analytical ability total scores (A*, A#, and A)
and the component analytical ability item types, namely, analytical
reasoning (AR) and logical reasoning (LR), to SR-UGPA in samples classified



Table 8

Relative Weighting of Quantitative Item-Type Part Scores in a Composite with A* and V*

                                     Beta weights               QC,RM,DI,    Increase
                                                                A*,V*        over
Field          (N)        QC     RM     DI     A*     V*         (R)         Q*,A*,V*

English       ( 884)    -004    045  [?]54    018    355         404          003
History       ( 584)     053    [?]    [?]   -040    347         380          006
Sociology     ( 364)    -037    058    013    182    301         444          002
Polit Sci     ( 545)     223    050   -007   -041    253         416          006
Chemistry     ( 644)     184    132    028    074  0[?]7         381          007
Computer Sci  ( 647)     [?]    126  10[?]    [?]    101         370          000
Mathematics   ( 251)     094    265   -084   -021    191         418          023
Elec Engin    ( 850)     219    124    047    082    030         389          003
Economics     ( 663)     [?]    068    002    159    260         454          000
Biology       (1318)     126    089    008    063  1[?]0         352          002
Agriculture   ( 976)     042    095    087    073    114         308          002
Education     (1649)     117    049    002    052    226         368          002
All fields    (9375)     107    089    025    051    197         364          001

Note. Entries are standard partial regression (beta) weights or multiple correlation
coefficients without decimals.

Underscored weights are estimated to be statistically significant (p < .05).

by field of study. In evaluating the observed correlations in Table 9, it
is important to keep in mind that total scores on the 50-item analytical
measure are more heavily influenced by performance on the 38 analytical
reasoning items than by performance on the 12 logical reasoning items.

Generally speaking, typical validity coefficients for the various
analytical total scores tend to be somewhat higher in the primarily
quantitative fields (except mathematics) than in the verbal fields (except
sociology). However, the AR and LR subtests do not have similar patterns of
validity coefficients across verbal and quantitative fields.

In this regard, perhaps the most striking aspect of the part-score
validity data in Table 9 is the strong contribution to prediction of
SR-UGPA, relative to that of the 38-item AR subtest, of the LR subtest based
on only 12 logical reasoning items.

o For all departments, the validity of the LR subtest was .225 as compared
to .229 for the longer AR subtest.

o In seven analyses, the validity coefficient for LR was approximately
equal to or greater than that for AR.

o In three analyses (for history, political science, and education
departments), the LR subtest validity coefficient was greater than that
for the A* total (which included the AR items).

o AR validities tended to be somewhat higher for the basically
quantitative fields than for the basically verbal fields; for LR,
validity coefficients tended to show an opposite pattern.

The relative weighting of AR and LR in an independently computed
composite and their weighting in a composite with V* and Q* are shown in
Table 10.

o When AR and LR were treated as predictors, AR weights were somewhat
higher than those for LR in composites for the chemistry, computer
science, mathematics, electrical engineering, biology, and agriculture
departments.

o LR weights were somewhat higher than AR weights in analyses for history
and political science (among the more verbal fields), for economics
alone among the more quantitative fields, and for education.

Although, when considered jointly as an independent battery, weights
for both AR and LR reached the .05 level of statistical significance in most
of the analyses, neither AR nor LR made a consistent, substantial
contribution to prediction when treated as elements in a battery that
included the V* and Q* total scores (cf. results in Table 6 for verbal
subtests combined with A* and Q* and in Table 8 for quantitative subtests
combined with V* and A*).

o Only the beta weight for LR was significant in the overall departmental
analysis, and its contribution to prediction was relatively slight (beta
= .058 as compared to approximately .190 and .185 for V* and Q*).

Table 9

Pooled Within-Department Correlations of Analytical Part
Scores and Various Total Scores with UGPA, by Field

                          Analytical part      Analytical total
                              scores                scores
Field          (N)        AR     LR       A*     A#     A     AR,LR
                           r      r        r      r     r       R

English       ( 884)     202    200      236  2[?]8    239     243
History       ( 584)     146    239      195    [?]    [?]     [?]
Sociology     ( 364)     [?]    [?]      [?]    [?]    [?]     372
Polit Sci     ( 545)     162    269      229    272    232     278
Chemistry     ( 644)     255    175      275    262    282     270
Computer Sci  ( 647)     224    221      259    267    256     266
Mathematics   ( 251)     218    214      239    250    [?]     263
Elec Engin    ( 850)     264    199      282    273    [?]     287
Economics     ( 663)     301    335      358    386    361     388
Biology       (1318)     224    179      244    242    251     250
Agriculture   ( 976)     220    164      240    238    255     246
Education     (1649)    2[?]    256      242    296    279     297
All Fields    (9375)     229    225      264    274    270     275

Note. A* is the raw total analytical score, z-scaled by form.
A# is an equally weighted sum of the analytical part scores.
A is the converted analytical score, equated across forms.
AR,LR is a best-weighted composite of the designated part scores.
Entries are correlation coefficients without decimals.

Table 10

Relative Contribution of AR and LR to Prediction of SR-UGPA in an Independent
Composite and in a Composite Including Q* and V*

                          Independent composite          Composite including Q* and V*
                          Beta weights   Multiple           Beta weights              Multiple
                                        correlation                                  correlation
Field          (N)         AR     LR       (R)        AR     LR     Q*     V*           (R)

English       ( 884)      148    145      (243)       012    013    070    353         (401)
History       ( 584)      074    214      (249)      -063    071    097    319         (361)
Sociology     ( 364)      241    214      (372)       133    107    031    284         (444)
Polit Sci     ( 545)      078    242      (278)      -102    088    248    231         (422)
Chemistry     ( 644)      221    094      (270)       069   -003    283    089         (374)
Computer Sci  ( 647)      160    157      (266)      -002    092    289    083         (376)
Mathematics   ( 251)      163    158      (263)      -010    024    288    175         (395)
Elec Engin    ( 850)      221    119      (287)       065    042    314    024         (387)
Economics     ( 663)      209    261      (388)       094    127    147    [?]         (461)
Biology       (1318)      185    116      (250)       044    039    191    173         (351)
Agriculture   ( 976)      176    119      (246)       054    047    178    103         (307)
Education     (1649)      [?]    197      (297)       008    080    154    206         (370)
All Fields    (9375)      170    [?]      (275)       022    058    190  [?]85         (365)

Note. Decimal points have been omitted from all coefficients. Underscoring indicates estimated
statistical significance at the p < .05 level.



o Suppression effects were found for AR in four departmental analyses and
for LR in one; in the samples involved, AR or LR criterion-related
variance is more than sufficiently represented in verbal and/or
quantitative total scores.

o Weights for both AR and LR were statistically significant in only two
field analyses (sociology and economics) and LR was significant in a
third (education).

The data in Table 10 suggest that the analytical test, as currently
defined by the 38 AR and 12 LR items, is not providing very much unique,
SR-UGPA-related information. This conclusion is reinforced by the data in
Table 11, which permit comparison of multiple correlations with SR-UGPA of
V*,Q* only and those yielded by adding A* and AR and LR, respectively.
Increments in R due to adding analytical test scores to V* and Q* typically
were quite small. In evaluating this finding, it is useful to know that
V*,Q* alone yielded a higher multiple correlation with SR-UGPA than either
A*,V* or A*,Q* in 9 of the 12 field analyses and in the total sample.

Understanding of these findings is advanced by reference to Table 12
and Table 12.1. In Table 12 it may be seen that LR is more closely related
to a verbal subtest (RC) than to AR. From Table 12.1 it may be determined
that the average within-test intercorrelations of verbal subtests (.503) and
quantitative subtests (.476) are greater than that observed for the two
analytical ability subtests (.360). Moreover, the correlation of LR with
three of the four verbal subtests is higher than that of AR with these
subtests, while the correlation of AR with each quantitative subtest is
higher than that of LR with these subtests. Intercorrelations corrected for
errors of measurement shown in Table 12.1 (below the diagonal) lead to
similar conclusions. In essence, AR items tend to have more in common with
quantitative items than with LR items, while LR items have more common
variance with verbal items than with AR items.



Verbal, Quantitative, and Analytical Part Scores
as a Battery

Table 13 shows major findings of an analysis of the regression of
SR-UGPA on seven item-type part scores, namely, VO, RC, QC, RM, DI, AR, and
LR. Standard partial regression (beta) weights are shown for variables
selected by stepwise regression as contributing at least .001 to R-squared.

o The consistent significant contribution to prediction of the RC and/or
VO subtests is noteworthy; both are significant in four analyses, RC
only is significant in five, and VO only in three (though acting as a
suppressor in one).

o The part score that appears to be contributing least to the battery is
data interpretation (DI). However, the score for this subtest met the
statistical significance criterion in the computer science and
agriculture analyses.

Table 11

Incremental Contribution of the Analytical Measure (A*) in
Part-Score and Total-Score Form to Prediction of SR-UGPA
After Taking V* and Q* into Account, by Field

                               Composite predictor              Difference in R
                           (1)       (2)         (3)
Field          (N)        V*,Q*    A*,V*,Q*   AR,LR,V*,Q*       (2-1)    (3-1)
                           (R)       (R)         (R)

English       ( 884)       401       401         401             000      000
History       ( 584)       374       374c        381a            000      007
Sociology     ( 364)       420       442         444             020      024
Polit Sci     ( 545)       408       410c        422a            002      014
Chemistry     ( 644)       370       374         374b            004      004
Computer Sci  ( 647)       368       371         376a            003      008
Mathematics   ( 251)       395       395c        395a            000      000
Elec Engin    ( 850)       381       386         387             005      006
Economics     ( 663)       439       455         461             016      022
Biology       (1318)       346       350         351             004      005
Agriculture   ( 976)       300       306         307             006      007
Education     (1649)       364       366         370             002      006
All Fields    (9375)       [?]       363         365             002      004

Note. Entries are correlation coefficients without decimals.

a AR negatively weighted
b LR negatively weighted
c A* negatively weighted

Table 12

Intercorrelations of Analytical Test Part Scores, and Their Correlations
with Selected Verbal and Quantitative Test Part Scores, by Field

                          AR score        AR score vs        LR score vs
Field          (N)        vs LR score      RC      QC         RC      QC
                              r             r       r          r       r

English       ( 884)         373           444     501        449     326
History       ( 584)         337           404     [?]        472     321
Sociology     ( 364)         341           [?]     [?]      44[?]     276
Polit Sci     ( 545)         352           436     476        468     404
Chemistry     ( 644)         369           415     436        467     295
Computer Sci  ( 647)         404           416     422        483   22[?]
Mathematics   ( 251)         346           362     437        512     290
Elec Engin    ( 850)         363           407     451        484     346
Economics     ( 663)         358           387     461        508     366
Biology       (1318)       33[?]           409     408        [?]     238
Agriculture   ( 976)         368           451     478        486     321
Education     (1649)         366           469     557        507     372
All Fields    (9375)         360           429     475        469     318

Note: Entries are correlation coefficients without decimals.

The higher coefficient in a given comparison is underscored.



Table 12.1

Pooled Within-Department Intercorrelations of
Item-Type Part Scores: Total Sample

       ANT    ANA    SC     RD     QC     RM     DI           AR           LR

ANT    --     528    493    ?86    277    266    233          282          356
ANA    843    --     509    486    337    290    260          335          371
SC     711    784    --     519    336    274    267          332          394
RD     608    650    749    --     360    322    316          40?          426
QC     433    372    485    450    --     548    440          475          318
RM     343    400    378    415    707    --     [illegible]  487          310
DI     336    359    388    456    635    651    --           [illegible]  264
AR     441    441    407    510    594    628    600          --           360
LR     514    549    593    615    459    462    440          519          --

Note: Values above the diagonal are observed correlations; those
below are corrected for errors of measurement by use of the
formula $r_{xy}/\sqrt{r_{xx}\,r_{yy}}$; reliabilities are estimated roughly.

Entries are correlation coefficients without decimals.
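
The correction for errors of measurement mentioned in the note to Table 12.1 can be sketched as follows. This is an illustration only: the report does not give the reliability estimates it used, so the function names and the equal split of reliabilities below are assumptions made here. The two coefficients are the ANT vs ANA entries as they read in the table (observed .528, corrected .843).

```python
import math

def disattenuate(r_xy, r_xx, r_yy):
    """Correct an observed correlation for unreliability in both measures."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)

def implied_reliability_product(r_observed, r_corrected):
    """Product r_xx * r_yy implied by an observed/corrected pair."""
    return (r_observed / r_corrected) ** 2

# Observed and corrected ANT vs ANA coefficients, decimals restored.
product = implied_reliability_product(0.528, 0.843)

# Hypothetical assumption: both subtests are equally reliable.
r_xx = r_yy = math.sqrt(product)
recovered = disattenuate(0.528, r_xx, r_yy)
print(round(product, 3), round(recovered, 3))
```

The implied product of the two reliabilities is well below 1, which is consistent with the short subtests involved and with the note that reliabilities were estimated only roughly.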
Table 13

Beta Weights for Subsets of Item-Type Part Scores Selected by Stepwise
Regression According to a Contribution to R-squared Criterion, by Field



Field

English
History
Sociology
Political Science
ALL VERBAL

Chemistry
Computer Science
Mathematics
Electrical Engin
Economics
ALL QUANTITATIVE

Biology
Agriculture
ALL BALANCED

Education

ALL FIELDS



Part-score beta weights
VO RC QC RM DI AR LR

Selected set (R)   V*,Q*,A* (R)



17 
15 
12 

12 

08 
09 

-id 
09 



05 
10 
06 

11 



23 
22 
21 
26 
23 



22 
19 
IB 
12 

17 
11 
12 



04 

22 
05 

19 
13 
08 
22 
08 
15 

13 
05 
09 



04 
09 
06 
04 

06 

f 

14 
13 
26 

11 
07 

13 

09 
10 
09 



12 06 



05 



10 
04 
04 



2i 
04 



-08 
11 
-10 



08 



06 
09 
£5 

05 
06 
05 



07 14 12 10 



06 

» 

10 
OS 

09 



12 
06 



05 
04 

08 

06 



406 
392 
451 
437 
(404) 

381 
377 
434 

, 409 
463 

(391) 

356 
309 
C332) 

373 



402* 
374" 

(396^) 



374 

371 
.395 
386 
455 
387 



ab 
ab 
b 

abc 
abc 



350 
306 
C330 

366 



ab 
abc^ 

ab 



06 (367) 



(363^*1 



( 884)
( 584)
( 364)
( 545)
(2377)

( 644)
( 647)
( 251)
( 850)
( 663)
(3055)

(1318)
( 976)
(2294)

(1649)

(9375)



Note. Entries are regression and correlation coefficients without decimals. The
regression coefficients tabled are for part scores contributing at least
.001 to R-squared; underscored coefficients also met a .05 statistical
significance criterion.

a V* significant, .05; b Q* significant, .05; c A* significant, .05
o The regular mathematics subscore contributed at least .001 to R-squared
in every analysis and is the only part score for which this was true.

o AR and/or LR were selected as part of the most efficient part-score
battery in 10 of the 12 field analyses (though with AR variance
suppressed in two).

Generally speaking, the best weighted composites of selected part
scores yielded somewhat higher multiple correlations with SR-UGPA than the
three total test scores; no corrections for shrinkage have been made,
however. In evaluating the findings in Table 13, it is important to note
that the subtests involved are of differing lengths and reliabilities, that
the analysis did not attempt to adjust for these factors, and that, given
moderately intercorrelated predictors such as those involved in the
analysis, regression weights are sensitive to relatively small changes in
validity.
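
Since the multiple correlations discussed above are uncorrected for shrinkage, a rough sense of the likely size of that correction may be useful. The sketch below applies the familiar adjusted R-squared formula, $1 - (1 - R^2)(N - 1)/(N - k - 1)$, which is an assumption made here and not a method reported in the study. The values R = .434 and N = 251 are those Table 13 appears to give for mathematics; the number of selected predictors, k = 4, is purely hypothetical.

```python
import math

def adjusted_r(r, n, k):
    """Shrinkage-adjusted multiple correlation for k predictors and n cases."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    r2_adj = 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)
    return math.sqrt(max(r2_adj, 0.0))  # floor at zero when the estimate is negative

r_observed = 0.434    # as read from Table 13 (mathematics, selected set)
n_examinees = 251     # mathematics sample size
k_predictors = 4      # hypothetical number of selected part scores
r_shrunk = adjusted_r(r_observed, n_examinees, k_predictors)
print(round(r_shrunk, 3))
```

Because the part scores were chosen by stepwise selection from a larger candidate set, a formula based only on the number of retained predictors is likely to understate the true shrinkage.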

Comparability of Regression Results for Unequated
and Equated Total Scores

The preceding analyses were based primarily on test scores that were
not equated across test forms. To what extent do patterns of findings based
on unequated score data provide a basis for projecting results that might be
obtained if equated part and total scores were to be employed? Table 14
presents findings bearing on the comparability of regression results for
unequated (V*,Q*,A*) and equated (V,Q,A) total scores on the respective
tests.

While there are differences in detail in the results of the parallel
analyses, the relative weighting of the verbal, quantitative, and analytical
scores, and the relative magnitudes of the multiple correlation
coefficients, by fields, are essentially the same for the two analyses. It
seems reasonable to infer that comparable results might be expected for
parallel analyses involving equated and unequated part scores (see Appendix B).

From Table 14 it may be determined that the multiple correlations for
the V*,Q*,A* composites are somewhat lower than those for the V,Q,A
composites due, it is assumed, to error associated with lack of equating for
V*, Q*, and A* across forms.



Summary of Trends in Findings

Major trends in the findings bearing on the predictive and/or construct
validity of item-type part scores are summarized below, by test.

With respect to the verbal ability measure--

o Although there are some exceptions, by field, reading comprehension
items (SC & RD) tend to be more valid than vocabulary items (ANT & ANA)
and the same tends to be true of the RC and VO composite item types.

o RC and VO item types appear to be contributing to the prediction of
Table 14

Comparisons of Regression Results for Unequated Raw Total Scores
(V*, Q*, A*) and GRE Scaled Scores (V, Q, A), by Field

                               Unequated Scores                     Equated Scores

Field           (N)     V*     Q*           A*     R         V      Q      A      R

English        ( 884)   353    066          026    (402)     354    075    025    (407)
History        ( 584)   345    097         -035    (374)     353    115   -041    (38?)
Sociology      ( 364)   296    025          189    (442)     294    0?1    184    (459)
Polit Sci      ( 545)   256    242         -044    (410)     253    256   -047    (417)
Chemistry      ( 644)   080    261          073    (374)     085    294    074    (389)
Computer Sci   ( 647)   103    276          060    (371)     108    287    051    (377)
Mathematics    ( 251)   193    296         -024    (395)     217    308   -022    (425)
Elec Engin     ( 850)   030    317          081    (386)     040    336    075    (405)
Economics      ( 663)   258    147          153    (455)     272    159    143    (468)
Biology        (1318)   178    191          062    (350)     195    204    058    (370)
Agriculture    ( 976)   112    [illegible]  074    (306)     125    206    066    (335)
Education      (1649)   226    149          050    (366)     224    149    056    (367)

All Fields     (9375)   196    187          052    (363)     204    201    049    (377)

Note: Entries are standard partial regression (beta) weights or multiple
correlation coefficients without decimals.

V*, Q*, A* are raw unequated total scores, z-scaled by form.
V, Q, A are GRE scaled scores, fully equated across forms.
academic performance in fields that vary widely in apparent verbal
emphasis.

o Majors in verbal fields tend to perform better on VO than on RC, while
majors in quantitative fields tend to perform better on RC than VO (with
the anomalous exception of mathematics; see Figure 1 and related
discussion).

With respect to the quantitative ability measure--

o Data interpretation (DI) items appear to be contributing only slightly
to overall predictive validity.

o Regular mathematics (RM) items may be particularly predictive of
performance in mathematics (hypothetically, because of a greater degree
of overlap between test content and curricular content for mathematics
majors than for others).

o Both RM and quantitative comparison (QC) items appear to be
contributing to prediction, though not necessarily equally so, in fields
that differ widely in apparent quantitative emphasis.

o Majors in verbal fields (for example, history, English, political
science) tend to perform much better on DI items than on other
quantitative item types, while the opposite is true for majors in math
and science fields (for example, engineering, chemistry, computer
science, mathematics).



With respect to the analytical ability measure--

o Based on their relative contribution to prediction of SR-UGPA, logical
reasoning (LR) items appear to be underrepresented and analytical
reasoning (AR) items overrepresented in the current 12-item to 38-item,
LR to AR, ratio in the analytical ability measure. The shorter LR
subtest appears to be as valid as the longer AR subtest.

o Analytical reasoning items behave more like quantitative ability items
while logical reasoning items behave more like verbal ability
items -- they may prove to be useful extensions of the two basic ability
measures.

o Majors in verbal fields perform better on LR than on AR, while the
opposite is true for majors in quantitative fields; ranks of fields in
terms of mean total analytical ability score differ considerably from
ranks based on AR and LR means, and ranks based on AR means differ from
ranks based on LR means.



Discussion



Findings regarding the GRE vocabulary and reading comprehension
subtests tend to confirm and extend findings based on parallel subtests of
the SAT verbal measure. These results, combined with results of factor
studies indicating distinguishable "vocabulary" and "reading comprehension"
verbal factors defined by item types like those in the subtests under
consideration in this study, suggest a potentially useful role for VO and
RC subscores as defined for this study.

No a priori rationale was available for projecting particular patterns
of validity for item-type part scores on the quantitative and analytical
ability measures. However, results indicated that the respective part
scores are measuring somewhat different aspects of quantitative and
analytical reasoning ability. Based on observed patterns of validity
coefficients for quantitative subtests and average scores for different
majors, the components of quantitative ability being measured by the data
interpretation items appear to be different from those being measured by QC
and RM. This is consistent with the results of a factor analysis of small
sets of items from the 1977 GRE Aptitude Test (Powers & Swinton, 1980) in
which DI item sets helped to define a varimax factor called "data
interpretation and technical comprehension" along with items from technical
reading passages and items from the 1977 version of the analytical ability
measure that seemed similar to the technical reading passages in content
and style. In a factor analysis (Rock, Werts, & Grandy, 1982) that
involved intercorrelations of item-type part scores paralleling those
employed in this study, the loading of the DI items on the quantitative
factor was less than the loadings for QC and RM items.

The uniquely high predictive validity of regular mathematics items for
mathematics majors, and evidence of differential validity for QC and RM
items across fields, suggest the potential for improved assessment in
separate consideration of the quantitative item types.

With respect to the analytical ability measure, perhaps the most
intriguing aspect of the findings that have been reviewed is (a) the rather
persistent indication that AR items tend to exhibit "quantitative"
characteristics while LR items tend to exhibit "verbal" characteristics,
but (b) that LR items may tend to be more valid than AR items. Powers and
Swinton (1981) found that logical reasoning items included in the 1977
version of the analytical ability measure were highly related to a reading
comprehension factor. And, with regard to the comparative validity of the
LR and AR item types, Wild, Swinton, and Wallmark (1982, Table 22) reported
that a subtest containing a 74 percent to 26 percent mix of 19 analytical
and logical reasoning items was less closely related to SR-UGPA than a
subtest of the same length that included only logical reasoning items (for
example, r = .204 for the AR/LR combination vs SR-UGPA as compared to r =
[illegible] for a 19-item subtest including only logical reasoning items).
Combining these two item types in a single score would appear to blunt
their predictive effectiveness (see Table 9 and related discussion);
moreover, the findings raise questions regarding the desirability of
including more AR than LR items in the analytical ability measure since the
logical reasoning items appear to have greater criterion-related validity
than the analytical reasoning items.

The results that have been reviewed point up the value of evidence
regarding the criterion-related validity of item types within the more
general verbal, quantitative, and analytical ability measures. Such
evidence could be helpful (a) as a factor to be considered in determining
the mix of items in a given ability measure -- for example, in decisions
regarding the proportional mix of existing item types or decisions to add or
eliminate particular item types and (b) in assessments of construct
validity -- for example, as supplementary to the findings of factor analysis.
Using data available in GRE files it would be feasible to develop, and
update periodically, basic correlational results for all fields based on
pooled departmental data for samples of enrolled undergraduates.*






*Previous studies employing SR-UGPA as an external academic criterion (for
example, Miller & Wild, 1979; Wild, Swinton, & Wallmark, 1982; Goodison &
Wild, 1982) have been based on total correlation matrices (that is,
test/UGPA correlations were computed for samples that were heterogeneous
with respect to undergraduate department, even though homogeneous with
respect to, say, broad graduate major areas). The direction and extent of
covariation among means of departments on the GRE and SR-UGPA variables
are not predictable -- differences in mean SR-UGPA by department cannot be
assumed to reflect differences in level of undergraduate performance.
Accordingly, the interpretation of analyses based on total correlation
matrices is complicated by the fact that such matrices include the
theoretically unpredictable among-means covariances as well as the
within-department covariances (see Appendix C).
Appendix A

Supplementary Data on GRE Part Scores

Six forms of the GRE General Test were administered between October
1981 and September 1982. Table A shows (a) the number of examinees in the
sample taking each form and their distribution by sex and (b) means and
standard deviations of scores on selected test variables, namely, the
verbal, quantitative, and analytical scaled scores, equated across forms,
and the raw scores on the various item-type subtests. The latter were not
equated across forms, and the average difficulty level of the items making
up each subtest may vary within tests for a given form as well as across
forms.

Based on the GRE scaled total scores, examinees who took forms used in
the first three administrations were somewhat more able than those who took
the three forms used in the last two administrations. Males constituted a
majority of examinees taking certain forms, while females constituted a
majority of examinees taking other forms (for example, in April and June).
A majority of all members of the study sample were female.

It may be determined that the part-score means do not covary
consistently with the scaled total score means although a tendency toward
positive covariation across forms between raw part scores and total scaled
scores is discernible. Data not tabled indicated that the raw total scores
on the respective tests covaried closely with the total scaled scores.

In z-scaling all raw scores by test form, using means and standard
deviations for all examinees taking each form regardless of administration
date, it was assumed (a) that there would be attenuating effects on the
relationship of the z-scaled scores to SR-UGPA associated with lack of
equating, but (b) that those effects would be random with respect to item
types across forms, and thus (c) that the relative weighting of particular
item types would not be influenced by any systematic biasing effect.
Evidence suggesting that these assumptions were generally valid is provided
in Appendix B.
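
The z-scaling by test form described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the form labels and raw scores are hypothetical, the function name `z_scale_by_form` is introduced here, and the use of the population standard deviation is an assumption, since the report does not say which variant was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def z_scale_by_form(records):
    """records: list of (form, raw_score); returns a list of (form, z_score)."""
    by_form = defaultdict(list)
    for form, raw in records:
        by_form[form].append(raw)
    # Mean and standard deviation of all examinees taking each form.
    stats = {form: (mean(v), pstdev(v)) for form, v in by_form.items()}
    scaled = []
    for form, raw in records:
        m, s = stats[form]
        if s == 0:
            raise ValueError("no score variation on form " + str(form))
        scaled.append((form, (raw - m) / s))
    return scaled

# Hypothetical raw part scores; form_2 is assumed to be the harder form.
records = [("form_1", 20), ("form_1", 24), ("form_1", 28),
           ("form_2", 14), ("form_2", 18), ("form_2", 22)]
scaled = z_scale_by_form(records)
for form, z in scaled:
    print(form, round(z, 3))
```

After scaling, examinees at the same relative position on each form receive the same z score, whatever the difference in form difficulty. The approach treats the groups taking each form as equally able, which is the source of the attenuation the text anticipates.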




Table A. Means and Standard Deviations of Raw Part and Converted Total Scores for Selected GRE
General Test Takers During 1981-82, by Test Form and Administration Dates

[Means and standard deviations are not legible. Sample counts are legible for three of the six forms and for all forms combined.]

Form           Administration   Males   Females   Total*

[illegible]    Apr              ?21     1116      205?
K-3DGR3        Apr               170     206       378
3EGR2          June              150     209       360

All Forms      Total            458?    4715      9375

* Includes individuals not identifiable by sex
Appendix B

Comparability of Part-Score Validity Profiles for Single Form
and Multiple Form Unequated Score Samples

Regression analyses employing unequated total verbal, quantitative, and
analytical scores from six different forms of the General Test and analyses
employing the three GRE scaled total scores, respectively, yielded entirely
comparable results (see text, Table 14 and related discussion). The
relative weighting of the three total scores was consistent across analyses.
As expected, the level of correlation was higher for the equated total
scores than for the unequated total scores, due, it is assumed, to errors
associated with lack of equating across test forms.

Parallel analyses employing equated and unequated part scores were not
feasible. However, intercorrelation matrices were generated for examinees
taking a single form of the test. The pattern of correlations of part
scores with SR-UGPA in this sample may be compared with that for examinees
taking several forms, with unequated scores, by reference to Figure B-1.

The part-score correlational profiles for the single-form and
multiple-form samples are quite similar, but the level of test-criterion
correlations tends to be higher in the single-form sample than in the
multiple-form sample. These results suggest strongly that conclusions
regarding the relative criterion-related validity of various item-type part
scores, based on the findings of the present study employing unequated
scores, would be applicable for equated part scores.
Appendix C

Factors Involved in the Use of Total vs Pooled Within-Group
Correlations in Validation Research

All regression analyses in this study employed pooled within-department
correlation matrices. All variables were z-scaled within each department
before pooling. Other research, employing the self-reported UGPA as an
academic criterion (for example, Miller & Wild, 1979; Wild, Swinton, &
Wallmark, 1982; Goodison & Wild, 1982) has used total correlation matrices
in which coefficients were based on data for all individuals in
departmentally heterogeneous samples.

Such total sample correlations are difficult to interpret because it
cannot be assumed that differences in mean GPA across several departments
represent substantive differences in achievement -- the nature of the
among-means correlation between GRE scores and GPA across several
departments is theoretically unpredictable since it is influenced by
arbitrary differences in grading standards among departments.

Exhibit C-1 provides scatterplots of GRE verbal (or quantitative) mean
and first-year graduate GPA means for samples of students from graduate
departments that participated in a study of the 1977 restructured GRE
General Test (Wilson, 1982).

o Overall, the scatterplot of GRE quantitative and GPA means for
chemistry, computer science, economics, and mathematics departments
(upper portion of the exhibit) suggests a low positive correlation among
departmental means.

o In the lower portion of the exhibit, it may be seen that, among
education departments, there is a clear tendency for mean graduate GPA to
vary inversely with mean GRE score, while, for the English departments,
the scatter of means suggests a generally positive, curvilinear
relationship.

The trends illustrated in the exhibit are consistent with the
proposition that neither the degree nor the direction of covariation between
departmental GRE and GPA means can be assumed to follow a predictable
pattern. Moreover, it is reasonable to infer that the total GRE-GPA
correlations for education and English majors would differ even though the
pooled within-group (within-department) correlations were identical. If
such were the case, the total GRE-GPA correlation should be higher for the
English sample (with positive among-means correlation) than for the
education sample (with negative among-means correlation).
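
The argument in the preceding paragraph can be checked with a small constructed example. The sketch below is illustrative only: all scores are hypothetical, the function names are introduced here, and deviations from department means stand in for the full within-department z-scaling used in the study (the two coincide here because every constructed department has the same spread).

```python
import math
from statistics import mean

def pearson(xs, ys):
    """Product-moment correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def pooled_within(groups):
    """Correlation of deviations from each department's own means."""
    dx, dy = [], []
    for xs, ys in groups:
        mx, my = mean(xs), mean(ys)
        dx += [x - mx for x in xs]
        dy += [y - my for y in ys]
    return pearson(dx, dy)

def total_r(groups):
    """Correlation over all individuals, ignoring department membership."""
    xs = [x for g in groups for x in g[0]]
    ys = [y for g in groups for y in g[1]]
    return pearson(xs, ys)

# Identical within-department pattern of test (x) and GPA (y) deviations.
dev_x = [-1.0, 0.0, 1.0, 0.0]
dev_y = [-0.5, 0.5, 0.5, -0.5]

def make(dept_means):
    return [([mx + d for d in dev_x], [my + d for d in dev_y])
            for mx, my in dept_means]

positive = make([(0, 0), (2, 1), (4, 2)])  # mean GPA rises with mean test score
negative = make([(0, 2), (2, 1), (4, 0)])  # mean GPA falls as mean test score rises

for name, groups in (("positive", positive), ("negative", negative)):
    print(name, round(pooled_within(groups), 3), round(total_r(groups), 3))
```

Both constructed samples share the same pooled within-department correlation, yet the total correlation is higher than that value when departmental means covary positively and lower (here even negative) when they covary negatively.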

Using data from the present study, total correlations between SR-UGPA
and the respective GRE item-type part scores (prior to within-department
standardization) were computed to provide a basis for comparison with the
pooled within-department correlations actually used in the study.
Illustrative findings are summarized in Figure C-1. Note, for example, that
in the data for education, and the combined agriculture and biology
samples, the total correlation is systematically lower than the pooled
within-department correlation while the opposite tends to be true for the
verbal samples.

The use of total rather than within-group correlations in the present
study probably would have led to somewhat different outcomes. Conclusions
regarding the relative level of validity of particular subtests for various
disciplines would have been affected, for example. It is not clear whether
or how outcomes bearing on the relative contribution of the various subtests
to prediction of the SR-UGPA criterion might have been affected.

Strictly speaking, it would seem that the most rigorously designed
studies of GRE correlations with SR-UGPA would call for the use of pooled,
within-department matrices. In validation research involving GPA criteria,
the use of total correlations in departmentally heterogeneous samples
involves elements of interpretive ambiguity that can be avoided only by
using pooled within-group correlations.